Abstract: 



Econometric cost functions have begun to appear in education adequacy cases with greater 
frequency. Cost functions are superficially attractive because they give the impression of 
objectivity, holding out the promise of scientifically estimating the cost of achieving specified 
levels of performance from actual data on spending. By contrast, the opinions of education 
stakeholders form the basis of the most common approach to estimating the cost of adequacy, the 
professional-judgment method. The problem is that education cost functions do not in fact tell us 
the cost of achieving any specified level of performance. Instead, they provide estimates of 
average spending for districts of given characteristics and current performance. It is a huge and 
unwarranted stretch to go from this interpretation of regression results to the claim that they 
provide estimates of the minimum cost of achieving current performance levels, and it is even 
more problematic to extrapolate the cost of achieving at higher levels. In this paper we review 
the cost function technique and provide evidence that draws into question the usefulness of the 
cost function approach for estimating the cost of an adequate education. 

The authors wish to acknowledge the support of the Missouri Show-Me Institute. The usual 
disclaimers apply. 
Introduction 



Econometric cost functions have begun to appear in education adequacy cases with greater 
frequency. While previously considered too technical for courts to understand, recent litigation 
in Missouri featured separate cost function estimates commissioned by each of two plaintiffs. A 
prior Texas court case presented results from two dueling cost studies commissioned by the 
opposing sides. 1 This increased use of the cost-function methodology likely reflects growing 
skepticism about other methods typically used to estimate the cost of providing an adequate 
education. In particular, the “professional judgment” (PJ) method has begun to lose favor. In 
this approach, panels of educators design prototype schools that they believe will provide 
adequate educational opportunities and then the consultants hired to conduct the study attach 
costs to these prototypes. Even a sympathetic trial judge in Massachusetts concluded that the PJ 
study submitted there was “something of a wish list.” Hence, although PJ studies are invariably 
included, recent finance cases have attempted to bolster these with econometric cost functions. 

Cost functions are superficially attractive because they appear objective, holding out the promise 
of scientifically estimating the cost of achieving specified levels of performance from actual data 
on spending instead of relying on opinions, as do professional-judgment estimates. In keeping 
with this perception, a group of education finance specialists began arguing that econometric cost 
functions are the most scientifically valid method to determine the cost of adequacy. To make 
this argument, they asserted that the methods for estimating cost functions in the private sector - 
where competition tends to drive out inefficient producers - could be readily adapted to public 
education. They prepared estimates for legislative committees and courts in states such as New 
York, Texas, Kansas, and Missouri, and published their work in academic journals. The 
problem, we shall argue, is that education cost functions do not in fact tell us the cost of 
achieving any specified level of performance, as claimed. 

This is not to say that cost functions tell us nothing. They do provide estimates of average 
spending for districts of given characteristics and indicate how spending varies by these 
characteristics in the specific state. For example, they may tell us that in state X, per pupil 
spending averages Y thousand dollars for districts with a certain percent of free-lunch or 
reduced-price lunch eligible (FRL) students or of black students and that the average rises or 
falls by Z dollars as these percentages change. Regression equations provide a useful summary 
of such patterns. By extension, including measures of performance (e.g., test scores) as a 
variable permits summarizing what the average spending is for districts with given demographics 
and performance levels. 

However, it is a huge and unwarranted stretch to go from this modest interpretation of regression 
results to the far more extravagant claim that these provide estimates of the cost of achieving any 
given performance level for districts of given demographics. There are two key heroic 
assumptions that are required: (1) the estimates of average spending among comparable districts 



1 Imazeki and Reschovsky (2004b), Gronberg, Jansen, Taylor, and Booker (2004). 

2 The alternative methods are discussed in Ladd, Chalk, and Hansen (1999). Particular attention to the use of cost 
functions can be found in Gronberg, Jansen, Taylor, and Booker (2004), Duncombe (2006), and Baker (2006a). 
Critiques can be found in Hanushek (2006, 2007). 

3 Costrell (2007), p. 291. 
can be adjusted so that they reflect the minimum efficient cost 4 to generate current performance 
levels; and (2) the estimated variation in average spending across districts with different 
performance levels can be used to extrapolate the costs of raising performance to levels not 
currently observed by comparable districts. 

As we will show in this paper, the method typically used to convert average spending figures 
into estimates of efficient cost accomplishes nothing of the sort. For that reason, there is no 
foundation for interpreting spending variations across districts with different demographics as 
the required spending premiums for demographic groups. Finally, the estimated relationship 
between “cost” and performance is highly unreliable - it is typically estimated with huge 
imprecision, wide sensitivity to model specification, and by methods that often fail to eliminate 
statistical bias. As a result, the cost estimates for raising performance to target levels have no 
scientific basis. 

None of this should be surprising. The recent push for experiments in education research is just 
one of many indications of the difficulty of estimating the effects of resources on student 
learning. Why would we need experiments if we could just use average district spending and 
average student test scores, as do cost functions, to estimate the effect of resources on 
achievement? Decades of research have repeatedly failed to find a systematic empirical 
relationship between average spending and performance. It would be quite noteworthy if a 
handful of recent spending equations were to suddenly have found a relationship that had eluded 
decades of previous investigation. This simply is not the case. The deeper reasons for this and 
the consequences thereof are the subject of this paper. 



The Basic Problem: The Cloud 



The logic behind regression-based estimates of the cost of adequacy is seemingly compelling. 
Why shouldn’t we be able to use data on district spending and student test-performance to 
estimate the costs of achieving a given outcome goal? 

The dimensions of the difficulty with this are easiest to see by looking at the simple relationship 
between spending and performance. Figure 1 shows a plot of spending and performance in 2006 
for the 522 districts of Missouri. The vast majority of districts lie in a solid cloud, spending 
between $5,000 and $8,000 per student and scoring, on average, between roughly 700 and 800 on the 
Missouri Assessment Program (MAP) tests. At virtually any spending level 
in the range of $6,000-8,000 there are some districts below 700 points and some over 800. This 
blob of data illustrates the two dimensions of the difficulty referred to above: (i) average 
spending differs greatly from minimum spending at any given performance level; and (ii) there 
is no apparent association between average performance and average spending in this group. 

There is a smaller number of districts spending over $9,000 but still no obvious pattern of being 
high or low on the math tests. Additionally, the size of the circles indicates the student 



4 To an economist, this is a doubly redundant phrase, since “cost” implies efficiency, which in turn implies 
minimum spending necessary to achieve a given outcome. Since this usage may not be universal, we use this 
phrase for clarity and emphasis. 



January 3, 2008 







Education Working Paper Archive 



populations. Some large districts are above average in performance, while others are below 
average. The two large and high spending districts that stand out are Kansas City and St. Louis. 
Both are noticeably below average in student performance. 

Taking all the districts together, the line in the picture shows that the simple relationship between 
spending and achievement is essentially flat. Even if, on average, there is a small relationship 
between average spending and average achievement, either positive or negative, the relationship 
is very weak. That is the fundamental challenge. How can one project the spending necessary to 
improve student performance to any level when the available data show little tendency toward 
higher achievement when given extra funds? 

Districts of course differ in a variety of dimensions other than spending, leading to a 
considerable amount of analytical effort to control for other factors in order to uncover any 
systematic influence of spending. The basic question is whether other factors that might affect 
performance, such as poverty levels, can be used to sort districts out of the cloud of Figure 1 
such that a pattern with spending emerges. Extensive efforts to do this, beginning with the 
“Coleman Report” (Coleman et al. (1966)), have been quite unsuccessful. These efforts, 
generally labeled estimation of production functions, have concentrated specifically on different 
backgrounds of students and have attempted to standardize for family inputs that are outside the 
control of schools. 5 



The Cost Function Approach 

The estimation of cost functions approaches this problem in a slightly different manner than most 
research exploring the relationship between spending and achievement. It focuses on how 
achievement levels determine spending, as opposed to how spending determines achievement. 
When put in terms of the determinants of spending, other things logically enter the analysis. 

First, districts might differ meaningfully in the prices that they face for inputs, particularly 
teachers. The price for teachers and other college graduates can be quite different for one district 
than for another because of the labor markets in which they compete. If districts must pay higher 
prices to obtain the same quality of resources, then omitting price differences could bias the 
estimated relationship between achievement and spending. Second, cost functions, similar to 
production functions, must account for possible variation in resource needs arising from students 
who have fewer resources at home and thus may require more resources at school, on average, to 
achieve the same level of performance. Again, if need differences are omitted from cost 
functions, the estimated relationship between achievement and spending may be biased. Third, 
districts may differ in the efficiency with which they use their funds. Two districts with similar 
spending, similar prices and similar needs might achieve quite different outcomes, based on the 
efficiency with which they use their dollars to produce the outcome in question. To isolate cost, 
these estimates must address differences in efficiency. 

The underlying premise of the cost function estimation is that correcting for price differences, 
the demands of different student bodies, and the efficiency of district spending will yield a clear 
relationship between achievement and the spending that is required to achieve each level of 



5 Hanushek (2003) 
performance. This relationship then permits identifying the spending required to achieve any 
chosen level of student achievement. 

Do these corrections work? 

To answer this question, we trace through some specific cost function analyses. We focus on 
those submitted in the Missouri court case because the data were readily available for purposes 
of replication and analysis. 6 However, the issues identified here apply to the entire genre of cost 
functions based on the “efficiency control” approach. 7 

Figure 2 provides a similar picture to that previously presented. 8 The performance measure, for 
2005, is a composite of each district’s performance on the state assessments — specifically the 
percent of students in the top two categories (out of five) on the math and communications arts 
exams across three grades. Unlike Figure 1, this figure places spending, to be determined by 
achievement and school factors, on the vertical axis. Figure 2 again shows there is a wide range 
of spending observed at any given level of performance. 9 As a result, the line fitted through 
these data exhibits a very weak relationship. 10 

What a cost function tries to do is to go beyond this simple (weak) relationship to estimate for a 
district of given characteristics the minimum expenditure required to meet some target 
performance level. This can be logically broken down into three steps in constructing the cost 
estimates: 

1) Control for district characteristics, so that “likes” can be compared with “likes.” As 
mentioned, one reason for the wide range of spending is that districts differ in 
characteristics, such as demography, school size, input prices, and variables thought to 
affect efficiency. The variation in spending among districts with comparable scores is 
partially related to these differences. Cost functions statistically control for 
demographics and other district characteristics with the conventional technique of 
multiple regression, discussed in the next section. 



6 Baker (2006b), Duncombe (2007). Baker was retained by the main group of plaintiff districts, the Committee for 
Educational Equality, and Duncombe was retained separately by the City of St. Louis. For the defense, Costrell was 
retained by the Attorney General of Missouri and Hanushek by the Defendant Intervenors (Shock, Sinquefield & 
Smith). 

7 In addition to some of the studies cited above, a partial list would also include, Duncombe and Yinger (1997, 2000, 
2005, 2007), Imazeki (2006), Imazeki and Reschovsky (2004a, 2004b), and Reschovsky and Imazeki (2003). 
Imazeki and Reschovsky, in their various publications about costs in Texas, alternately use an efficiency index 
derived from a data envelopment analysis (DEA), include a Herfindahl index, or ignore the issue. 

8 Figures 2-5 are based on the Duncombe data and analyses. The Baker data and analyses are very similar, and the 
corresponding figures are available upon request. Both studies pool data across several years, although these 
diagrams only depict one year. 

9 Similarly, there is a very wide horizontal range: at any given spending level, performance varies widely. 

10 In fact, if these data suggest any relationship at all, it is U-shaped, rather than linear, which means a negative 
relationship between performance and spending over the lower ranges of performance and a positive relationship 
over the higher ranges. The linear relationship has an R² of only 4%, the portion of the variation in spending 
accounted for by variations in performance. A quadratic relationship, depicting the U-shaped curve, provides a 
much better fit, with an R² of 30%. 
2) Purge inefficiency from the estimates of spending. This is the key step in converting a 
spending function to a “cost” function. It does so by standardizing the values of the 
“efficiency controls” used in step (1). If successful, this procedure would identify the 
minimum expenditure required to perform at the current level. 

3) Estimate the cost of raising performance to the target level. This involves using the 
estimated relationship between cost and performance to predict the cost associated with 
increasing performance to a set goal. It requires a reliable estimate of the relationship 
between cost and performance from step (1). 

As we shall see below, the cost function methodology does not succeed in this agenda. In order 
to understand the issues more fully, we provide a detailed discussion of these steps: (1) 
controlling for district characteristics, (2) purging inefficiency from average spending, and (3) 
estimating the additional cost for additional performance. 



The Econometrics of Spending Equations: Controlling for District Characteristics 

The basic technique of linear regression is illustrated by the line through the data in Figure 2. 
Each point on the line represents the best estimate for average spending among districts with any 
given test score. Dots above the line (blue) denote districts spending above the estimated 
average and dots below the line (red) denote districts spending below the estimated average, for 
any given test score. 

The technique of multiple linear regression is conceptually identical, except that it adds more 
variables with which to predict spending. The additional variables cannot be depicted 
graphically in two dimensions, but the idea of adding variables to an equation is straightforward: 

$$
\text{spending}_{it} = \beta_0 + \beta_1 \cdot \text{performance}_{it}
  + \beta_2 \cdot (\text{teacher\_salaries})_{it} + \beta_3 \cdot (\%\text{FRL})_{it} + \dots + \beta_n \cdot (\text{prop\_val})_{it} + \dots + u_{it} \qquad (1)
$$

Spending in district i and year t is specified to depend on student performance, teacher salaries 
(as the key input price), percent FRL, other demographic and school variables (such as school 
size), and a set of “efficiency controls” such as property values (discussed in the next section). 
The unexplained component, $u_{it}$, is the error term representing factors not captured by the 
measured attributes. It can be positive or negative, but has an average value of zero. The 
regression estimates the coefficients $\beta_0$, $\beta_1$, $\beta_2$, etc., to provide the best fit to the data, minimizing 
the variation in the estimated error term. 11 
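
As a rough illustration of what estimating equation (1) involves, the following Python sketch fits a spending equation by ordinary least squares. Everything in it is an assumption made for illustration: the data are synthetic rather than the Missouri data, the variable names and coefficient values are hypothetical, and the sketch ignores the logarithmic forms and instrumental variables used in the actual studies.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # hypothetical number of district-year observations

# Synthetic regressors, for illustration only (not the Missouri data).
performance = rng.normal(50, 10, n)     # composite test measure
teacher_salary = rng.normal(40, 5, n)   # thousands of dollars
pct_frl = rng.uniform(0, 100, n)        # percent FRL students
prop_val = rng.normal(100, 20, n)       # property value per pupil (an "efficiency control")

# Assumed spending pattern plus a mean-zero error term u.
u = rng.normal(0, 800, n)
spending = (2000 + 5 * performance + 60 * teacher_salary
            + 15 * pct_frl + 10 * prop_val + u)

# Design matrix with an intercept column, in the order of equation (1).
X = np.column_stack([np.ones(n), performance, teacher_salary, pct_frl, prop_val])

# Least-squares estimates of the coefficients.
beta, *_ = np.linalg.lstsq(X, spending, rcond=None)

# Fitted value: estimated average spending for a district with these characteristics.
fitted = X @ beta
# Residual: how far each district's actual spending lies from that average.
residual = spending - fitted

print("estimated coefficients:", np.round(beta, 2))
print("mean residual:", float(residual.mean()))
```

The fitted values are estimates of average spending given the regressors. Nothing in the calculation identifies a minimum, which is the point made in the text.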

The key point here is that the resulting equation is a spending equation which gives an estimate 
of average spending for a district of given performance and other characteristics. There is 
nothing controversial about this statement - the cost function practitioners would agree, since 

11 In the interest of simplicity, the text omits a number of technical details. For example, these equations are often 
estimated in logarithmic form for the dependent variable and some independent variables. Also, typically the 
estimation uses instrumental variables for the performance variable (and perhaps others, such as teacher salaries), as 
will be discussed in a later section. 
this is only the first step in their estimation of the cost, or minimum spending necessary to 
produce a given level of performance. 

We will defer discussion of the key coefficient on performance, $\beta_1$, to a later section, but some of 
the other coefficients are readily interpreted. The estimated coefficient $\beta_3$ represents the 
additional spending, on average, among districts with higher percentages of FRL students, 
holding other variables constant. In essence it indicates what districts with different levels of 
poverty are spending. It does not represent the extra cost required to achieve any given 
performance level for FRL students. All a positive $\beta_3$ coefficient in equation (1) would reflect is 
a tendency of either the state or the district to spend more heavily when there is a greater 
proportion of students in poverty, while any similar tendency to spend less on poor students 
would yield a negative coefficient. This interpretation of $\beta_3$ holds regardless of whether extra 
spending is required to increase performance or is effective at doing so. 12 

The distinction is quite important, because coefficients estimated from such equations are 
regularly adduced to specify cost premiums (or student weights) in school funding formulas. 13 

For example, in Missouri, the estimate of $\beta_3$ was taken to mean that a student receiving a 
subsidized lunch in an average district is over 50% “more expensive than a student not receiving 
a subsidized lunch to bring up to the same performance level,” an interpretation that goes beyond 
what is warranted from a spending equation. 14 

The interpretation of demographic coefficients is further illustrated by variables for race. As an 
example, the estimate by Baker (2006b) of the extra spending for black students in Missouri was 
70 percent. 15 The direct interpretation of this equation is that Missouri spends more on average 
in districts with higher concentrations of black students (controlling for FRL, etc.). This is 
consistent with Missouri’s history of mandated remedies in prior desegregation cases. But, since 
these estimates are drawn from spending equations (not cost equations), it is an over- 
interpretation to conclude that these coefficients represent the required extra cost for black 
students to achieve any given level of performance. 

Consider next the control for teacher salaries. The idea here, drawn from the theory of 
competitive markets, is that if important input prices are beyond the producer’s control, they are 
an independent determinant of cost. For such input markets, the producer is said to be a “price- 
taker.” However, it is highly questionable whether such conditions are reasonably satisfied by 
teacher markets. While much of the variation in teacher salaries across districts is correlated 
with the wages of non-teaching college graduates in the region (labor market), within regions 
districts vary meaningfully in salaries they pay to teachers. This within-region variation draws 



12 To be sure, it is uncontroversial that higher FRL is associated with lower district performance; but the statistical 
evidence that extra spending systematically raises performance over the observed range is highly controversial. 
Student-level data from Missouri indicates no relationship between spending and performance of African-American 
FRL students (Podgursky, Smith, and Springer (2007), Figure 10). 

13 Duncombe and Yinger (2005). 

14 Duncombe (2007), p. 24. 

15 Because of the specific functional form, the estimate varies modestly depending on the percent of students that are 
black. The estimate given above is for the average district in the state, while for St. Louis, the figure is 85% (Baker 
(2006b)). The estimate in Duncombe (2007) also implies a very substantial premium, but because of the way that 
race entered the equation (interacted with FRL) the interpretation is less straight-forward. 
into question the “price-taking” assumption. 16 Consequently, district variation in teacher salaries 
likely includes discretionary variation, not simply variation in cost. This problem is recognized 
by some of the cost function practitioners, and their attempted solution is discussed in a later 
section (on instrumental variables). The point here is that input pricing illustrates the difficulty 
in adapting cost function estimation from competitive markets to the very different environment 
of public education. 

To see the effect of all the controls in (1) taken together, consider each district’s fitted value for 
spending. This is the value for each district using the estimated $\beta$s in (1) and setting the error 
term to zero. It represents the estimated spending for the average district of that specific 
district’s characteristics. In the simple case of Figure 2, where performance was the only right- 
hand-side variable, the fitted value is represented by the line and the actual value is represented 
by the dots. The difference between actual spending and fitted spending is the distance from the 
dots to the line (also known as the residual). 

How is this affected by the addition of all the explanatory variables in (1)? The answer is seen 
in Figure 3. The deviation of actual spending from fitted spending - the amount that each 
district differs from the regression line - is depicted on the vertical axis, plotted against the 
district’s performance. In effect, Figure 3 replicates Figure 2 except that instead of plotting 
actual spending on the vertical axis, it plots spending adjusted for performance and other district 
characteristics including student poverty, race, teacher salaries, etc. 

For St. Louis, the effect of these controls is striking. In Figure 2, St. Louis was the highest 
spending among districts with comparable test scores. In Figure 3, St. Louis is among the 
lowest-spending of these districts, after controlling for district characteristics. St. Louis is a 
very large district that has high levels of FRL and of percent black students that go along with its 
high spending. Thus, after adjusting for these other factors, Figure 3 indicates that St. Louis 
spends a bit below (but quite close to) the average of what would be predicted based on Missouri 
spending patterns. 

St. Louis is far from alone in spending below the estimated average of comparable districts: 
approximately half the districts in the state fall in the same category, as Figure 3 shows. This is 
true by definition of averages; since Lake Wobegon is not located in Missouri, about half the 
districts will be above average and about half below average. The same logic that holds for 
simple averages carries over to the regression methodology, which estimates average spending 
among comparable districts. The large number of deficits we saw in Figure 2, for simple 
regression, appears again in Figure 3 - by construction. To interpret these shortfalls from the 
average as an adequacy shortfall would be logically absurd, since it would mean there is always 
an adequacy shortfall among about half the districts, no matter how high or low spending is. To 



16 The collective bargaining environment is a textbook case of the violation of the competitive price-taking 
assumption for inputs, as the impersonal forces of the market are replaced by relative bargaining power. 

17 Some practitioners (including Baker (2006b)) use regional cost indices instead of teacher salaries. This avoids 
some of the difficulties discussed above, but may only weakly reflect prices faced by districts. 

18 Duncombe’s equation includes performance (instrumented), teacher salaries (instrumented), % FRL, % FRL x % 
black, % SPED, indicator for K-12 district, a set of indicators for district size, property values, district income, state 
aid relative to income, % college educated, % age 65 or older, % housing units owner occupied, median housing 
price relative to average property values, and a series of year indicators. 
be sure, these deviations are not quite the adequacy shortfalls implied by the cost function - that 
requires one further step - but, as we shall see, the nature of those shortfalls is very largely 
determined by the deviations shown in Figure 3, results which follow ineluctably from the logic 
of averages. 
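
The "logic of averages" can be checked numerically. The sketch below is a hypothetical illustration on synthetic data (the variable names, sample values, and the helper `share_below_fitted` are our own assumptions, not part of any of the studies discussed). It fits a spending regression, counts the share of districts that fall below their fitted value, and then repeats the exercise after raising every district's spending by $5,000.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 522  # number of districts, matching the count cited for Missouri

# Synthetic district characteristics, for illustration only.
performance = rng.normal(750, 30, n)
pct_frl = rng.uniform(0, 100, n)
spending = 6000 + 10 * pct_frl + rng.normal(0, 900, n)


def share_below_fitted(spend):
    """Share of districts spending less than the regression's fitted average."""
    X = np.column_stack([np.ones(n), performance, pct_frl])
    beta, *_ = np.linalg.lstsq(X, spend, rcond=None)
    residual = spend - X @ beta
    return float(np.mean(residual < 0))


share_now = share_below_fitted(spending)
# Give every district an extra $5,000 per pupil and re-estimate.
share_after_raise = share_below_fitted(spending + 5000)

print("share below fitted spending:", share_now)
print("share below fitted spending after a $5,000 raise:", share_after_raise)
```

Because an across-the-board increase is absorbed by the intercept, the residuals are essentially unchanged, and roughly the same share of districts sits below the fitted line at either spending level.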

Although the statistical controls do not affect the fact that about half the districts spend above 
and below average, they do affect the size of the deviations. Comparing Figures 2 and 3, we see 
that the controls help account for some part of the spread in spending over the upper and lower 
ranges of test scores, but do not much reduce the estimated spread in the mid-range. The spread 
containing the bulk of these districts remains about $2,000-$3,000, as it is over much of the test 
score range. In short, using statistical controls for observable district characteristics helps to 
identify some spending patterns (e.g. by FRL and race), but still leaves unexplained a wide range 
of spending among districts of similar observed characteristics and performance. 19 



The Econometrics of Spending Equations: Controlling for District Efficiency 

To convert the spending equation to a cost function one needs to identify the minimum 
expenditure necessary to achieve any given level of performance - the definition of efficient. As 
Duncombe (2007) points out, “Because data is available on spending, not costs, to estimate costs 
of education requires controlling for differences in school district efficiency (p.3).” 

It is increasingly common to deal with this issue by including “efficiency controls” — variables 
which are thought to affect efficiency — among the explanatory variables in the spending 
equation (1). 20 Unfortunately, there is no line item in budgets for “waste, fraud, and abuse.” 
Moreover, if it were obvious what factors determined inefficiency in schools, local and state 
citizens and authorities would be likely to take actions to correct the inefficiency. Thus, the 
quest for a set of observed and measurable factors that convert the spending functions into cost 
functions by separating inefficiencies from required spending is obviously difficult. 

As one example of using efficiency controls, Duncombe’s equation for Missouri includes seven 
“efficiency-related variables,” categorized as either “fiscal capacity” variables, such as per pupil 
property values, income, and state aid, or “monitoring variables,” such as percent of population 
aged 65 or more and percent of college-educated adults in the district. The argument here is that 
districts with greater “fiscal capacity” may experience less pressure to be efficient (or a greater 
inclination to spend on non-tested subjects), and that older or college-educated voters may exert 



19 This variation could be the result of inadequate controls for true differences across districts. For example, the 
percent of students eligible for free or reduced-priced lunch is likely to be a poor measure of the variation in 
resources that students receive at home across districts, especially across relatively high-poverty districts. Yet, these 
coarse measures are often the only measures available to researchers or to those designing and implementing school 
funding formulas. However, as previous analyses of achievement show, even with exceptional measures of district 
characteristics, much of the variation in achievement for districts with the same spending is likely to remain. 

20 Other methods have also been used, which attempt to identify statistically the points at or near the bottom of 
figures comparable to Figure 3. These methods, stochastic frontier analysis and data envelopment methods, have 
been used by Duncombe and others in earlier papers, but recent work, including that presented in court, focuses on 
the method discussed in the text. See, for example, Grosskopf, Hayes, Taylor, and Weber (1997), Duncombe, 
Ruggiero, and Yinger (1996). 
greater “monitoring” for efficiency. No analysis - within this paper or elsewhere - directly 
relates any of these variables to efficiency - that is just a maintained hypothesis. In a similar 
analysis for California districts, Imazeki (2006) includes “efficiency controls,” but focuses on 
local competition instead of fiscal capacity or monitoring as her measure of efficiency, using the 
Herfindahl index for the number of districts in the labor market. 

These variables are simply added into the spending equation (1). At this point, the equation is 
still a spending equation - all that has been done here is to single out a subset of the explanatory 
variables. A district’s age, education, income, property values, the competition it faces, etc. may 
well affect spending patterns, over and above the student demographics and other variables, and 
the estimation of (1) sheds further light on those patterns. One may or may not choose to 
interpret these variables as controls for efficiency (and, if so, they are certainly imprecise 
controls), but either way (1) remains a spending equation. 

The typical procedure used to convert (1) from a spending equation to a cost function is to 
standardize the level of efficiency across districts by setting the values of the efficiency variables 
at uniform levels, rather than the actual district-specific values, and setting the error term to zero 
as given by Equation (2). 

$$
\text{(``cost'' of achieving current performance)}_{it} = \beta_0 + \beta_1 \cdot \text{performance}_{it} + \beta_2 \cdot (\text{teacher\_salaries})_{it} + \beta_3 \cdot (\%\text{FRL})_{it} + \dots + \beta_n \cdot (\text{ave\_prop\_val}) + \dots \qquad (2)
$$

It is common in these cost-function analyses to set the value of the “efficiency controls” (such as 
property values per pupil) at the statewide average. Setting the error term to zero, of course, is 
also choosing the average. What this means is that about half the districts will be found to spend 
more and half less than the estimated “cost” of achieving at their current performance levels. 
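To illustrate why this procedure places roughly half of all districts below estimated "cost," the following is a minimal sketch on simulated data. The variable names, coefficient values, and data-generating process are all assumptions chosen for illustration; none of them are estimates from Duncombe (2007) or from actual district data.

```python
import numpy as np

def share_below_cost(n_districts=500, seed=0):
    """Fit a spending equation, then compute "cost" with the efficiency
    proxy set to its mean and the error term set to zero."""
    rng = np.random.default_rng(seed)
    performance = rng.normal(25.0, 5.0, n_districts)       # simulated test index
    efficiency_proxy = rng.normal(0.0, 1.0, n_districts)   # e.g. property wealth
    noise = rng.normal(0.0, 0.15, n_districts)             # unexplained spending
    log_spending = 8.5 + 0.02 * performance + 0.03 * efficiency_proxy + noise

    # Ordinary least squares fit of the spending equation.
    X = np.column_stack([np.ones(n_districts), performance, efficiency_proxy])
    beta, *_ = np.linalg.lstsq(X, log_spending, rcond=None)

    # "Cost": same coefficients, efficiency proxy at its average, error at zero.
    X_cost = X.copy()
    X_cost[:, 2] = efficiency_proxy.mean()
    log_cost = X_cost @ beta

    # Fraction of districts whose actual spending is below estimated "cost".
    return float(np.mean(log_spending < log_cost))

share = share_below_cost()
```

Because the deviations from the fitted line are centered on zero by construction, the share lands near one half regardless of how efficient the simulated districts are assumed to be.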

This result is depicted in Figure 4, which presents the difference between each district’s actual 
spending and the estimated “cost” of achieving its actual performance level. 

How are these figures to be interpreted? Spending for a district can be higher than cost because 
that district may not be using its resources wisely for maximizing the test performance of 
students. It is, on the other hand, logically impossible for a district to spend less than the 
minimum necessary to achieve actual performance levels. It would be one thing to recognize 
that “cost” may be imperfectly estimated and there could be a few outliers. But the estimation 
technique here systematically determines that spending is less than “cost” for about half the 
districts. 21 

Let us be clear on the source of the problem. One might think that the problem is the use of 
average values for the efficiency variables, rather than values that imply something closer to 
maximum efficiency (minimum spending). This is a valid criticism, but in fact the problem lies 
deeper. 



21 Cost function analysts acknowledge that they are only estimating “average efficiency,” a term that would seem to 
modify the definition of cost. However, they continue to state that the estimated cost figures represent what is 
“necessary” or “required” to achieve any given result, which effectively restores the original definition. Figures 4 
and 5 use the “required” terminology, from Duncombe (2007). 
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The primary source of the problem is that the “efficiency controls” do little to explain the 
variations in spending, and are rarely convincing measures of the full range of efficiency. The 
deviations depicted in Figure 4 have netted out the estimated effect of these variables on 
efficiency, but are still quite large. The step that purportedly converts the spending equation (1) 
to the “cost” equation (2) has very little effect on this fundamental problem. 

Taking St. Louis as an example, the set of seven “efficiency” variables from Duncombe (2007) 
taken together tends to raise St. Louis spending above districts with average values of those 
variables, so the calculated “cost” using those averages is a bit lower than the fitted value in (1). 
Consequently, St. Louis’ slight deficit depicted in Figure 3 becomes a slight surplus in Figure 4: 
St. Louis spends slightly more than is “required” to achieve its actual test scores. As this 
example illustrates, for most districts there is not much difference between Figures 3 and 4. The 
interpretation placed on Figure 4 by the cost function methodology, however, is totally different 
- cost vs. spending. This re-interpretation is not defensible. 

In short, the method that purports to convert average spending to cost does nothing of the sort. 
The adjustment from the “efficiency controls” is minor, which is not surprising given that it would be
difficult to argue that these variables do a good job of measuring true variation in efficiency. 

The major step is that the deviations depicted in Figure 3 - deviations from average spending of 
comparable districts - are simply redefined as deviations from “cost.” That is why the “cost” 
estimates carry the logically incoherent implication that half the districts spend less than is 
necessary to achieve what they have achieved. 



Extrapolating from the “Cost Function” to a Different Performance Level 

The third step in calculating the cost of adequacy is to apply the estimated cost function to a 
target performance level. This step is accomplished by simply replacing actual performance with 
target performance in the calculation: 

$$
\text{(``cost'' of achieving target performance)}_{it} = \beta_0 + \beta_1 \cdot (\text{target\_performance}) + \beta_2 \cdot (\text{teacher\_salaries})_{it} + \beta_3 \cdot (\%\text{FRL})_{it} + \dots + \beta_n \cdot (\text{ave\_prop\_val}) + \dots
$$

For example, one of the targets considered in Missouri was to raise St. Louis from its current
level of 16.7 to a level of 26.3. 22 If we apply this target to all districts (not just St. Louis), the
result is to raise the “cost” for those districts below 26.3 and to reduce it for those districts above,
relative to their estimated cost of current performance, provided that the estimate of $\beta_1$ is
positive. This obviously increases the estimated shortfall from required spending for the former 
and reduces it for the latter. This imposes a substantial “tilt” on Figure 4, pushing down the 
points on the left side of the diagram and pushing up the points on the right. 
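The size of the tilt follows from subtracting the "cost" of current performance from the "cost" of target performance term by term. A short derivation, assuming the linear form written above with $\beta_1$ denoting the coefficient on performance (the superscripts are labels introduced here for illustration):

```latex
\begin{align*}
\text{``cost''}^{\,\text{target}}_{it} - \text{``cost''}^{\,\text{current}}_{it}
  &= \beta_1 \cdot \left(\text{target\_performance} - \text{performance}_{it}\right)
\end{align*}
```

Every other term cancels, so the adjustment for each district is proportional to its distance from the target: positive for districts below the target and negative for those above it whenever $\beta_1 > 0$.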

The result is Figure 5, depicting actual spending vs. “required” spending to achieve the 
performance target of 26.3. These estimates redistribute the shortfalls from higher-performing 



22 Duncombe (2007) identifies this as the Missouri School Improvement Program (MSIP) standard for St. Louis in
2008. This target happened to be near the state average in 2005, of 25.6. 
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districts to lower-performing ones. 23 For example, St. Louis was depicted in Figure 4 as
spending slightly above what was “required” to achieve its current level, but Figure 5 depicts St. 
Louis as $1,541 below what is “required” to achieve the higher target. Districts with lower 
performance are adjusted even further, to yield estimated shortfalls of over $4,000. 

To assess whether the estimates of cost in Figure 5 are valid, we must directly assess the two key 
features of the cost estimates: (1) the methodology for estimating the “cost” of generating 
current outcomes - which we have already seen is fundamentally flawed - and (2) the estimated 
coefficient $\beta_1$, which is applied to that base to generate the “cost” of target performance. 24 The
estimate of $\beta_1$ is key to the whole exercise, so it is critically important that it is estimated
accurately, with a high degree of confidence, and that it not be sensitive to arbitrary choices in 
model selection. Unfortunately, there are several reasons why this standard is not met. 



Imprecision in Estimated Coefficients, and in Estimated “Cost” 

The first problem is that the regression coefficients are often estimated with relatively wide 
confidence intervals, even assuming that the model is correctly specified and appropriately 
estimated (assumptions we revisit below). For example, Duncombe’s estimate of $\beta_1$ is that costs
rise by 0.39 percent for every 1 percent increase in performance. However, the 95 percent
confidence interval ranges from 0.07 percent to 0.71 percent, spanning a factor of 10. Similarly,
in the study of California districts, the 95 percent confidence interval for Imazeki’s (2006)
estimate ranges from 0.05 percent to 0.63 percent. Even if everything else is correct, one can
have very little confidence in the adjustments that lead to estimates of needed costs, moving from 
Figure 4 to Figure 5. 

The problem of wide confidence intervals applies to the other coefficients as well, which is a 
matter of some importance for the issue of demographic cost premiums. For example, 
Duncombe’s estimate of $\beta_3$ implies a premium for FRL students of 52 percent, but the 95 percent
confidence interval is 27 percent to 80 percent. Similarly, there is an implied premium for 
students in special education of 49 percent, but the 95 percent confidence interval is 19 percent 
to 80 percent. Again, these are very wide confidence intervals, and even they assume everything 
else is estimated correctly. 

The imprecision in all the estimated coefficients, along with the estimated variance in the
unexplained component of (1), contributes to wide confidence intervals in the estimated “cost” of
meeting performance targets. For St. Louis, the estimated “cost” of performing at a level of 26.3 
in 2005 is $11,597. However, the 95 percent confidence interval is from $8,367 to $16,074. 



23 The fact that the process starts from a logically flawed base can still be seen in Figure 5, by examining the large 
number of red dots to the right of the vertical line. These are districts that are found to spend less than “required” to 
meet the standard that they are already meeting. 

24 Duncombe (2006) and Baker (2006a) have argued that the upward tilt in diagrams such as this in Kansas
(Duncombe) and other states (Baker) provides some evidence in support of the approach’s statistical validity (albeit a
“fairly weak validity test” in Duncombe’s view). However, as our step-by-step derivation shows, the tilt simply
reflects the estimated sign of $\beta_1$. The point is that any positive estimate of $\beta_1$, even if it is highly problematic (for
reasons such as those discussed in the next section), will necessarily generate a positive tilt in a diagram such as 
Figure 5. Thus, a positive tilt is of no independent value in assessing the validity of the cost function estimates. 
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Since this interval contains the current level of $10,056, one cannot conclude that spending is 
inadequate to achieve that target at conventional confidence levels, even if the rest of the analysis 
is solid. In addition to the problems identified above, however, there are special problems with 
estimating $\beta_1$, to which we now turn.
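The interval arithmetic behind these statements can be checked directly from the reported figures. The sketch below assumes a normal approximation with a critical value of 1.96, which is an assumption made here and not stated in the source; the function names are hypothetical.

```python
import math

def implied_standard_error(lower, upper, z=1.96):
    """Half-width of a symmetric interval divided by the critical value."""
    return (upper - lower) / (2.0 * z)

def log_midpoint(lower, upper):
    """Geometric mean: the midpoint of an interval that is symmetric in logs."""
    return math.exp((math.log(lower) + math.log(upper)) / 2.0)

# Elasticity of cost with respect to performance, as reported in the text.
se_beta1 = implied_standard_error(0.07, 0.71)
t_beta1 = 0.39 / se_beta1

# St. Louis "cost" interval and current spending, as reported in the text.
low, high, current_spending = 8367.0, 16074.0, 10056.0
center = log_midpoint(low, high)
contains_current = low <= current_spending <= high
```

Under these assumptions the implied standard error of the performance coefficient is about 0.16. The reported St. Louis interval is centered on the $11,597 point estimate only on a log scale, which is consistent with (though not proof of) an interval constructed in logarithms, and it does contain current spending of $10,056.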



Special Problems with Estimating the Cost of Raising Performance 

There is a long history of trying to estimate the relationship between average spending and 
average performance, and it is not an encouraging one. For decades, it has proven difficult to 
find a systematic relationship, and the problems that have plagued that research also pertain to 
the cost function estimates. For one thing, the control variables are imperfect, the choice of 
variables is arbitrary in some cases, and the estimates are often sensitive to that choice. 

More importantly, it may be that spending affects performance, rather than the reverse, as is
assumed in the spending relationships that are estimated. Indeed, the whole theory of the court
case is precisely that: providing more resources leads to higher achievement. The
implications of this are very serious for the estimation of the spending/cost relationships, because
$\beta_1$ will now reflect both effects even though only the impact of achievement on costs is desired.
A related problem is omitted variables: a third
factor, such as parents’ interest in education, may affect both spending and achievement. Both of
these problems give reason to believe that $\beta_1$ is likely to be estimated with bias.

Cost function analyses often try to use instrumental variables techniques to reduce bias.
However, the requirements of this technique are difficult to fulfill and cost functions to date have
not utilized convincing instruments.

Finally, the estimates differ dramatically depending on the specification, whether spending is 
modeled as a function of achievement or achievement is modeled as a function of spending. 

(i) Sensitivity to Selection of Other Variables 

The first problem is that the results are often highly sensitive to which variables are included in 
the model. For example, in both the Duncombe and Baker models for Missouri, the results are 
highly sensitive to the inclusion of race. If race is excluded from the model (as it surely would 
be, if it were to be used for an actual funding formula), the coefficient on performance, $\beta_1$, is no
longer statistically significant, which is to say the 95 percent confidence interval includes zero. 

Similarly, estimates in Baker (2006b) are highly sensitive to which “efficiency controls” are 
included in the estimating equation. His data set contains six such variables - similar to those 
used by Duncombe - though he selects only four of them. Among the 64 possible combinations 
of those six controls, the $\beta_1$ estimate is statistically indistinguishable from zero almost half the
time, and in most of those cases the model’s “fit” is better than the one chosen by Baker. One
cannot have much confidence in any single estimate of $\beta_1$ if both the estimates and the
confidence intervals are so highly sensitive to arbitrary choices in model specification. 
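The count of 64 is simply the number of subsets of six candidate controls (2^6, including the empty set). A minimal sketch of enumerating such a specification search follows; the control names are hypothetical placeholders, not the variables in Baker's data set.

```python
from itertools import combinations

# Hypothetical placeholder names for six candidate "efficiency controls".
controls = [f"control_{i}" for i in range(1, 7)]

def all_specifications(names):
    """Every subset of the candidate controls, including the empty set."""
    return [subset
            for size in range(len(names) + 1)
            for subset in combinations(names, size)]

specs = all_specifications(controls)
four_control_specs = [s for s in specs if len(s) == 4]
```

A robustness check along these lines would re-estimate the model once per subset and compare the performance coefficient and its confidence interval across all of them, rather than reporting a single chosen specification.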
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These sensitivities are found in other states as well. Results provided in Duncombe (2006) show 
that the estimate of $\beta_1$ for Kansas loses its statistical significance if an interaction term is omitted
(free lunch multiplied by pupil density). If the time period 2000-2004 is broken up into 2000-1
and 2003-4, the estimate for $\beta_1$ doubles between these subperiods, but for neither period is it
statistically significant. 

(ii) Endogeneity Bias, Omitted Variables Bias, and “Instrumental Variables” 

A second problem is statistical bias due to mutual causation between spending and achievement 
(“endogeneity bias”) and/or omitted variables that are likely to affect, or at least be correlated 
with, both spending and performance. For example, some districts are more education-oriented 
than others, simply due to the gathering of like-minded citizens, with specific characteristics that 
are not captured by the observable variables. These districts may tend both to spend more and to 
have more highly performing children. If so, then the relationship between spending and 
performance will be biased upwards, since their statistical association will be picking up in part 
the effect on each of them of the unobserved degree of education-orientation. 

The usual solution to this problem is a technique known as “instrumental variables.” Under this 
technique, “performance” is considered a “troublesome explanator” for spending and does not 
actually enter into the estimating equation (1). 25 Instead a proxy variable or set of variables is
used, known as “instruments.” The idea is that instead of using variation in achievement that 
could be a result of a third variable that also affects spending and thus is subject to bias, this 
technique uses only the variation in achievement that comes from a known source that does not 
independently affect spending. The theory of this approach is compelling; however, in practice it 
is rarely well implemented. The problem is that this technique has some very stringent 
requirements, which are rarely met. In the context of cost function estimation, it is very difficult 
to identify variation in achievement that is the result of factors that do not independently 
influence spending. If these conditions are not met, the instrumental variable solution to the 
problem of bias can easily make the problem worse. 26 
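The following simulation is a minimal sketch of this point, not a reproduction of any study's method. It uses the simplest just-identified instrumental variables estimator (a ratio of covariances, with one instrument and no other controls) rather than the multi-variable procedures used in the cost function studies. All names and parameter values are assumptions for illustration.

```python
import numpy as np

def cov(a, b):
    """Sample covariance between two arrays."""
    return np.cov(a, b)[0, 1]

rng = np.random.default_rng(0)
n = 20000
true_beta = 1.0                          # assumed effect of performance on spending

orientation = rng.normal(size=n)         # unobserved "education orientation"
valid_z = rng.normal(size=n)             # shifts performance only
performance = valid_z + orientation + rng.normal(size=n)
spending = true_beta * performance + 2.0 * orientation + rng.normal(size=n)

# An invalid instrument: it is correlated with the omitted factor,
# so it is related to spending through a channel other than performance.
invalid_z = 0.2 * valid_z + orientation

ols = cov(performance, spending) / np.var(performance, ddof=1)
iv_valid = cov(valid_z, spending) / cov(valid_z, performance)
iv_invalid = cov(invalid_z, spending) / cov(invalid_z, performance)
```

In this setup the least squares estimate is biased upward, the valid instrument recovers a value close to the assumed coefficient, and the invalid instrument produces an estimate further from the truth than least squares.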

There are statistical tests that provide some defense against using invalid instruments, and at a 
minimum the cost functions should pass the relevant test. These tests are weak, since they have 
to assume that some of the instruments are valid, in order to test whether all of them are; yet, in 
the case of Missouri, the instruments failed these tests for both cost functions submitted to court. 
Thus, the adequacy estimates were not only methodologically flawed, but statistically invalid. 

In addition to the problem of invalid instruments, which lead to biased estimates, there is the
problem of weak instruments - proxy variables that are only weakly correlated with
performance. This leads to an overstatement of the statistical significance of the performance
coefficient. In other words, the claim that $\beta_1$ - the key coefficient in the whole exercise - is
statistically distinguishable from zero, is often undermined by weak instruments. A final 
difficulty in the instrumental variables approach is that the choice of instruments may be 



25 When “teacher salaries” is used as an input price control (as in Duncombe (2007)), it is also treated as a
“troublesome explanator,” and instrumented.

26 Murray (2006) 
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somewhat arbitrary and the estimated performance coefficient may be quite sensitive to the
choice of instruments. 27



(iii) Sensitivity to Specification as “Cost” vs. “Production” Function

Finally, cost estimates are extremely sensitive to whether spending is modeled as a function of 
achievement or achievement as a function of spending. There are two traditions looking at the
relationship between student performance and spending: the production function approach and 
the cost function approach. The key difference between the two is whether the focus of attention 
is achievement or spending. Each approach standardizes for a variety of other factors such as 
economic disadvantage of families, district attributes such as population density, and other 
things, and then looks at the remaining correlation of spending and achievement. The difference 
is whether spending is on the left side of (1) and performance on the right (cost function) or 
whether these are reversed (production function). 

The first thing to note is that these two approaches must necessarily be related. After all, they 
look at the relationship between the same basic elements of achievement and spending. Viewing 
them together provides an easy interpretation of the empirical evidence, but unfortunately this is 
seldom done. The one exception, where production function and cost function approaches are 
placed side by side, is Imazeki (2006). Imazeki’s analysis finds that achieving adequacy in 
California is estimated to require additional spending of $1.7 billion if a cost function estimate is 
used and $1.5 trillion if a production function estimate is used - clearly a striking difference. 

Both the cost function and production function estimates show weak and imprecise relationships 
between average district spending and average student achievement, as illustrated in Figure 6 for 
eighth-grade math scores in the 522 districts of Missouri in 2006. After allowing for differences 
in the free and reduced price lunch populations, in the racial composition (percent black), and in 
the number of students, one can plot achievement against spending in a way that uses statistical 
methods to control for the other characteristics mentioned. 

The figure shows that there is a slight upward slope of the spending line, but the dominant 
picture of Figure 6 is (once again) essentially a cloud - where districts with the same spending 
get wildly different achievement. The line has a statistically significant but relatively small 
positive slope of 0.0028 scale points per dollar (t=3.1). The flatness of this line is important: 
spending more money given the current way it is spent yields very little achievement gain. Put 
another way, if one wishes to get a large change in achievement (as discussed below), it will cost 
a very large amount of money, even assuming that this linear relationship can be extended far 
away from the current spending: it costs $357 per pupil to raise achievement one point. 



27 The Missouri cost functions suffered from both problems discussed in this paragraph, although the point was 
somewhat moot since the instruments chosen were invalid. 

28 It should be pointed out that Figures 6 and 7 (below) are not necessarily representative of all student outcome 
measures. If one took a different grade to look at these relationships or looked at reading instead of math, most 
alternatives actually give insignificant relationships between spending and achievement, and frequently they have 
the wrong sign. This might be expected since the regressions are drawing lines through these clouds of points with 
very little shape to the points that allow estimating such a relationship. A few districts performing at a slightly 
different point in the cloud can change the slope of the relationship. 
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Figure 6 is not very encouraging for the proponents of reaching adequate levels of performance 
solely through spending more money. If it requires tripling or quadrupling funds to get students
to the adequate level, most reasonable people will immediately see that this is not a viable public 
policy. 

But there is another way of looking at the data. By looking at how spending varies with 
achievement - the cost function approach that we have been discussing above - the picture looks 
far more manageable. Figure 7 turns the previous picture on its side and looks at the amount of 
spending as a function of achievement (after allowing for the same factors of free and reduced 
price lunch, race, and district size). Again the dominant feature is the cloud of districts that 
spend very different amounts to reach any given performance level. But now the line that goes 
through the points tells a very different story. It is flat once again, but this now indicates that one 
can move across very large achievement levels at modest cost. The regression coefficient 
indicates that each $6.62 raises achievement one point. 

These regression coefficients reflect the same data - they both have identical t-statistics of 3.13 - 
but they differ dramatically on the estimated cost of raising achievement: $357 per point vs. 
$6.62 per point, a factor of 54 (and of course this ignores the wide confidence intervals around 
each of these estimates). The ultimate reason that these estimates differ so much, even though 
they use the same data, is that the fit is not very tight. If the fit were perfect, the estimates 
would coincide: turning the diagram on its side would not only turn the dots on their side, but 
also the line. However, when the fit is so weak, each diagram will generate a flat curve, because 
they are each minimizing the variation in error terms measured vertically. 
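The link between fit and the gap between the two estimates can be made concrete. For two regressions that reverse the roles of the same pair of variables, the product of the two slopes equals the squared correlation, and the two t-statistics are identical. The sketch below checks this on simulated data; the numbers generating the data are assumptions for illustration and are not the Missouri data.

```python
import numpy as np

def simple_ols(x, y):
    """Slope and t-statistic from regressing y on x (with an intercept)."""
    xc, yc = x - x.mean(), y - y.mean()
    slope = (xc @ yc) / (xc @ xc)
    resid = yc - slope * xc
    se = np.sqrt((resid @ resid) / (len(x) - 2) / (xc @ xc))
    return slope, slope / se

rng = np.random.default_rng(1)
spending = rng.normal(8000.0, 1500.0, 522)                       # dollars per pupil
score = 710.0 + 0.003 * spending + rng.normal(0.0, 30.0, 522)    # scale points

prod_slope, prod_t = simple_ols(spending, score)   # points per dollar
cost_slope, cost_t = simple_ols(score, spending)   # dollars per point
r_squared = np.corrcoef(spending, score)[0, 1] ** 2

# Dollars per point implied by each specification, and their ratio.
cost_per_point_production = 1.0 / prod_slope
ratio = cost_per_point_production / cost_slope
```

The ratio of the two implied costs per point is 1/r², so it grows without bound as the fit weakens. Applied to the coefficients reported in the text, 0.0028 × 6.62 is about 0.0185, whose reciprocal is about 54, the factor cited above. With control variables included, the same identity holds for the partial correlation, provided both regressions use the same controls.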

The cost function makes it appear that it is much more feasible to change achievement by simply 
spending more with the current schools and the current institutional arrangements. For example, 
in Missouri, the average score on math is 733 and proficiency is defined as 800, so there is a gap 
of 67 points. Under the “cost” function estimate, it costs 67 x $6.62 = $443 per student to close 
the gap. Under the production function estimate, the “cost” is $23,919. When the estimates 
vary so wildly from two equally defensible ways of looking at the data - neither one of which 
finds a strong relationship - it is hard to place much credence on either estimate. 



Conclusion 



Determining the dollars necessary to provide an adequate education is not an easy task. The 
commonly employed technique of using professional judgment to design prototype schools is far 
from satisfying. Case studies of particularly successful schools may provide insights into 
effective approaches but are also unsatisfying because success is often the function of 
particularly dynamic leadership or teaching that may be difficult to replicate under current 
institutional arrangements. Regression-based approaches, often called “cost function analyses,” 
provide a superficially attractive alternative because they apply seemingly objective methods to 
data on district spending and achievement to determine the cost of reaching standards. 
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While on first blush the regression-based approaches are appealing, on further exploration, as 
discussed above, they are fraught with problems, revealing very little about the cost of improving 
student achievement. The issues facing regression-based models are of two overarching types:
technical problems that skilled analysts with sufficient data can correct in their models and 
conceptual problems that bring the overall approach into question. 

Given sufficient data, a skilled analyst can estimate a regression-based model to produce 
informative estimates of the spending patterns, by district characteristics and outcomes. Even 
the most skilled analyst, however, will typically find “cost” estimates that are highly imprecise, 
sensitive to judgment calls in modeling, and subject to bias. 

The underlying difficulty is that even after controlling for a host of variables (including labor 
market prices, student and school characteristics, among other variables) there is still a great deal 
of variation across districts in their outcomes for students, in districts with the same 
expenditures. There are a number of reasons for these differences that draw the regression-based 
approaches into question. In particular, we have little way of knowing how much of these 
differences are driven by unobserved cost or price differences, by mismanagement, or by a focus 
on goals other than the student achievement measures used in the cost functions. 

Cost function analysts are aware of these problems. They use efficiency controls and 
instrumental variables approaches to adjust for these difficulties. However, as demonstrated 
above, in practice both approaches fall woefully short of convincing. We simply do not have 
good measures of efficiency. The proxies that have been used are, at best, weak measures of 
efficiency with substantial measurement error, and measurement error itself creates bias. 
Instrumental variables can, in theory, address the biases due to omitted variables and mutual 
causation, but, in practice no researchers have identified strong and valid instruments. Weak and 
invalid instruments have been shown repeatedly to overstate statistical significance and to 
increase bias rather than mitigate it. 

The usual practice of identifying “cost” as the average spending among comparable districts 
always yields the logically impossible result that about half the districts spend less than is 
required to achieve what they have achieved. This problem has practical implications as well. 

If courts and policy-makers accept a methodology that defines minimum expenditures by 
averages, they will then have to raise the expenditures of those below the average, thereby
raising the average again. This methodology is a recipe for perpetual findings of inadequacy 
under forever-recurring litigation. 
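The ratchet described here can be sketched in a few lines. The example deliberately simplifies by treating "cost" as the plain average of spending rather than a regression-adjusted average, and the five spending figures are hypothetical.

```python
import numpy as np

def raise_to_average(spending, rounds=5):
    """Repeatedly lift every district below the mean up to the current mean,
    recording the mean and the share found below it in each round."""
    spending = np.asarray(spending, dtype=float).copy()
    history = []
    for _ in range(rounds):
        average = spending.mean()
        share_below = float(np.mean(spending < average))
        history.append((average, share_below))
        spending = np.maximum(spending, average)   # bring low spenders up
    return history

# Hypothetical per-pupil spending for five districts.
history = raise_to_average([8000, 9000, 10000, 11000, 12000])
averages = [avg for avg, _ in history]
```

Each round raises the average, and each new average again leaves some districts below it, so a finding of inadequacy recurs in every round as long as spending differs across districts.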

The failure of regression-based approaches to identify the cost of adequacy is nowhere as clear 
as when comparing the results of spending as a function of achievement to those of achievement 
as a function of spending. Cost functions assume that spending changes as a function of 
achievement; but it makes just as much sense, if not more in the case of education, to assume that 
achievement changes as a function of spending. A comparison of these two approaches, 
however, produces vastly different estimates with vastly different implications for policy if 
interpreted as identifying the causal effect of spending on achievement. Of course, such an 
interpretation is not warranted. 

The “cost function” approach simply does not identify the causal relationship between spending 
and achievement. This failure should not be surprising. We would not need randomized 
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experiments or detailed longitudinal data on student learning to estimate the effects of resources 
if this could be done so simply with district-level data on spending and average student 
achievement. However, while not surprising, the problems with regression-based approaches do 
highlight the difficulty of basing school finance decisions on currently available estimates of the 
cost of adequacy. All techniques for estimating the cost of adequacy are seriously flawed. None 
of them can provide a convincing cost figure. 

At best, each method provides some limited information - the current distribution of spending 
and achievement, the cost of a variety of prototype schools, the activities and expenditures in 
some particularly successful schools. This information can be better than no information for 
what is ultimately a political decision of how much to spend, but it cannot provide a dollar figure 
that will guarantee student success or even the opportunity for student success. The most 
important lesson that emerges from the data - with its wide variation in achievement for 
comparable expenditures - is that how money is spent is crucial for determining student 
outcomes. Educational excellence requires a system with the knowledge, professional capacity, 
incentives and accountability that will lead schools to determine how to spend their funds most 
effectively to raise student achievement and reach the variety of goals we have for students. 
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Figure 1. Missouri District Average 8th Grade Mathematics Scores and District Spending: 2006
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Figure 2: Expenditures vs. Performance in Missouri, 2005 
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Figure 3: Actual vs. Fitted Spending, with Controls for District Characteristics 
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Figure 4: Actual vs. “Required” Spending to Achieve Current Performance 




"required" spending calculated from Dr. Duncombe's equation, Table 2. Districts with enrollment < 350 excluded. 
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Actual spending - "required" spending to meet 26.3 in 2005 
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Figure 5: Actual vs. “Required” Spending to Achieve Target Performance 
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Figure 6. “Production Function” Relationship between 8th Grade Math and Spending
(holding constant race, enrollment, and free or reduced lunch eligibility), Missouri 
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Figure 7. “Cost Function” Relationship between Spending and 8th Grade Math
(holding constant race, enrollment, and free or reduced lunch eligibility), Missouri 
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